---
title: Averaging Over Varieties of Variance; Evidence from Survey Experiments on Immigration

# to produce blinded version set to 1
blinded: 0

authors: 
- name: Mohit Negi
  thanks: The authors would like to thank the students in our class who generously provided feeback to our paper (Kacie Dragan and Yi Ning Chang), the teaching staff of GOV2001 (Professor Gary King and TFs Georgina Evans, Shusei Eshima, Sascha Riaz), and the authors of the original paper who kindly answered to our requests for clarifications.
  affiliation: NM College of Commerce & Economics, Mumbai
  
- name: Emma Pendl-Robinson
  affiliation: Junior Data Scientist, Mathematica Policy Research
  
- name: Marco Scipioni
  affiliation: Project Officer, European Commission

abstract: |
  Recent literature on immigration attitudes has focused on estimating via survey experiments the determinants of such attitudes regarding immigration. 
  Considering the degree to which opinions are polarized on immigration, while knowing the overall public opinions on specific questions is of substantive interest, so is the appreciation of the amount of variation that we can observe.
  We expand upon a recently published survey experiment by first devising a multilevel model accounting for variation between both respondents and countries. 
  Simulating from such multilevel model fails to retrieve significant differences between high and low skilled immigrants as in the original analysis of the survey. 
  We model the spread at the country level and find that skill level of the immigrant has a small but significant effect on the spread in several countries and that the difference in such a spread between countries depends on the features of the immigrant.

bibliography: bibliography.bib
output: rticles::asa_article
---


\section{Introduction}
\label{sec:intro}

Research on attitudes towards immigration and their role in shaping contemporary politics and society has greatly expanded over the past decades.
Past studies have captured preferences in public opinions regarding immigration (@HainmuellerJens2015THAI; @BansakK2016Heha; @FordRobert2020TCCP), tested the stability of attitudes over time (@DanielStockemer2019TRCI; @KustovAlexanderTSoI), and investigated their role in shaping political outcomes (@GrandeEdgar2018PiiW; @FordRobert2020TCCP; @StockemerDaniel2018IPot; @Green-PedersenChristoffer2017AhtI).
In this area, experimental methods have been applied mainly in the form of survey experiments to capture the effect of treatments on respondents' attitudes.
A recent article titled, "Economic and cultural drivers of immigrant support worldwide," by @ValentinoNicholasA2019EaCD is remarkable because of the numerous treatments deployed in the survey and its cross-national scope.
The paper aims to explain the causal determinants of public opposition to immigration in 11 countries with a series of survey treatments relative to type of employment, racial composition, family status, and country of origin.

We extend the original paper by @ValentinoNicholasA2019EaCD in two main ways. 
First, we explore the possibility of fully exploiting the nested structure of the data. 
In their analysis, after having clustered the repeated observations for each survey respondent, the authors either analysed the entire sample together or they analyse each country separately. 
Such strategy prevents them to explore how information relative to each grouping variable is captured in the model.
Considering that the article's cross-national scope is one of its strength, being able to simultaneously analyse effects between and within countries is preferable. 
We provide a rough description of the importance to include each of the clustering levels in a single multilevel model by calculating the intraclass correlation. 
We also propagate uncertainty during estimation via simulating from our multilevel model. 
The results of the simulation  - while subject to assumptions and dependent upon several methodological choices - point towards a more cautionary approach towards one of the main result in the original paper, namely higher support towards highly skilled immigrants. 

Second, we expand the paper by shifting the question of interest from _relative to measures of central tendency_ to _dispersion_. 
Much of the literature on attitudes towards immigration tends to present central tendency measures - e.g. means or median - of specific opinions regarding immigrants, such as views on skills levels of migrants, or towards family reunification, or whether immigrants represent a challenge to welfare states. 
We believe that the analysis should also concentrate on how dispersed the opinions within countries are. 
While it is informative to know that, for instance, views on immigrants contributions to the welfare state may be more positive in country *A* rather than country *B*, and what could be the reasons for that, it is equally important to appreciate what is the relative degree of dispersion (or consensus) within a country.
If measurements of central tendencies are relatively close, but variance in the data is large, this should alert the reader about the substantive implications of differences-in-means estimators.

We study this dispersion between respondents within countries by measuring the "spread" as the standard deviation in the data.
Our results indicate that while the spread between countries is smaller than the spread among respondents, the former is non-negligible. 
This suggests that a model that includes all nesting levels - in this case, observations nested within countries - is more informative than a model with either no country identifier or individual regressions for each state.   
Further, following best practices in the literature (@GelmanAndrew2006DAUR), we propagate uncertainty during estimation by simulating from our multilevel model. 
As a consequence, we were not able to retrieve significant changes within countries depending on immigrants' skills level, and between countries among identical treatment assignment. 

Modeling the spread shows that the countries in our sample can be roughly split into two groups. 
A first, South Korea (KR) and Japan (JP) show a relative smaller spread in our simulations.
A second, larger group of Western countries display larger but similar levels of spread. 
Substantively, our simulated standard deviations range from about 0.26 to about 0.35, on a dependent variable ranging from 0 to 1. 
This means the spread conditional on the treatments in the paper is rather large, suggesting that the public opinions across all countries in the sample are not extremely cohesive. 
This makes intuitively sense, considering how polarised public opinions are on immigration, and thus the expectations of a firmly held opinions on such divisive issue is perhaps unrealistic.
However, this should be better regarded as a comparative matter, in as much as judgements on how cohesive public opinions should be checked across domains.
Thus, future research could consider applying an explicit modeling of variance across public policy domains to test for significant differences. 

\section{The Quantities Of Interest (QOI)}
\label{sec:QOI}

In the literature on attitudes towards immigration, several experiments have striven to identify the types of immigrants that a given population may prefer, usually through conjoint experiments. 
These types of studies were conducted in the United States (US), and the European Union (EU).
In the US @HainmuellerJens2015THAI investigate, 'whether various subgroups of Americans respond differently to specific immigrant attributes.'
They report that, to their surprise, ‘our results uncover a sweeping consensus across different groups about which types of immigrants to admit.’
This ‘hidden American immigration consensus’ has important policy consequences according to @HainmuellerJens2015THAI, in as much as ‘mass-level consensus indicates that any disagreements among the mass public are more likely to be over how many immigrants to admit or how to handle immigrants already here than over whom to admit.’
In the EU, @BansakK2016Heha observe that the probability of accepting asylum seekers depends on the applicants’ features, such as whether they are male or female, or where do they come from, or their previous occupation.
In particular, both studies use conjoint experiments, which 'ask subjects to evaluate hypothetical profiles with multiple, randomly varied attributes and are widely used [...] to measure preferences and the relative importance of structural determinants of multidimensional decision-making’ (@BansakK2016Heha). 
With an experiment nested in the most recent ESS wave, @FordRobert2020TCCP test whether opposition to immigration is more consistently connected with immigrants’ skills rather than their origin.
They find systematic evidence for a skills premium, meaning that immigrants with higher skills are generally more accepted than those with lower ones, but do not find the same widespread opposition based on a differentiation of immigrants as coming from a European or non-European country.

The key point of the present paper is that all these survey experiments-based studies use measures of central tendencies - such as average effects of treatments - as their key quantities of interest. 
Our proposal in this paper is to shift the focus towards a different quantity of interest, namely the variance. 
In the context of mounting divisions and polarisation regarding immigration in many countries (see @2019AtWM), knowing the average opinion regarding immigration is both informative and of substantive interest, as it is testing whether being more informed regarding immigration would soften opposition, or whether specific immigrants' traits are connected with more scepticism.
Besides all this, if opinions are widely different and fragmented within countries, public policies would benefit from also appreciating the degree to which opinions vary, and whether such variation dwindles when conditioned upon a (series of) treatment.

```{r, libraries, message=FALSE, warning=FALSE, include=FALSE}
# read in libraries
library(tidyverse)
library(haven) # read_dta()
library(data.table)
library(sjmisc) # frq
library(sjlabelled) # get_labels()
library(sjPlot) # plot_model()
library(lme4) # lmer function
library(magrittr)
library(plm)
library(stargazer)
library(MASS)
library(ggplot2)
library(reshape2)
library(gridExtra)
library(grid)
library(broom.mixed) # https://cran.r-project.org/web/packages/broom.mixed/broom.mixed.pdf
library(arm) # https://cran.r-project.org/web/packages/arm/arm.pdf ; se.coef, se.fixef, se.ranef, sigma.hat
library(kableExtra)
library(modelsummary) # https://vincentarelbundock.github.io/modelsummary/

theme_set(theme_bw())

# country code to country name key
countryt_key <- data.frame(countryt = c("AU", "CA", "CH", "DK", "ES", "FR", "JP",
                                        "KR", "NO", "UK", "US"),
                           country_name = c("Australia", "Canada", "Switzerland",
                                            "Denmark", "Spain", "France", "Japan", 
                                            "S.Korea", "Norway", "UK", "USA"))
```


\section{Data}
\label{sec:data}


```{r, message=FALSE, warning=FALSE, include=FALSE}
# load data and filter
dt <- haven::read_dta("../../data/raw/imm.bjpols.dta") %>%
  as.data.table() %>%
  filter(!is.na(uid)) %>% # rm rows with no uid
  filter(!is.na(support)) # rm where dependent variable is not provided

# creating new variables
dt %<>%
  mutate(candf = factor(cand), # create factored versions of treatment variables  
         Job_f = factor(treat_hstat, levels = c(0, 1), labels = get_labels(treat_hstat)),
         Complexion_f = factor(treat_drk, levels = c(0, 1), labels = get_labels(treat_drk)),
         ME_f = factor(treat_nat_me, levels = c(0, 1), labels = get_labels(treat_nat_me)),
         Children_f = factor(treat_kids, levels = c(0, 1), labels = get_labels(treat_kids)),
         Gender_f = factor(treat_gender, levels = c(0, 1), labels = get_labels(treat_gender))) %>%
  mutate(female = factor(gender, levels = c(1,0), labels = c("female", "male"))) %>% # recode gender var for to ease interpretation
  left_join(countryt_key, by = "countryt") # add country name


# factoring variables
# list of factor variables
fact_vars <- c("id", 'uid', 'country', "countryt", 'cand', 'insample', 
               'gender', 'education2', 'occupation2', "inc2", 'inc3',
               'treat_hstat', 'treat_drk', 'treat_nat_me', 'treat_kids', 'treat_gender')

# Recoding vars as factor
dt %<>% mutate(across(fact_vars, as.factor)) 
```

The *average causal effects* ^[According to @ImaiKosuke2017Qss, 'Researchers want to estimate the unknown value of a *quantity of interest* using observed data. 
We refer to the quantity of interest as a *parameter*, and the method to compute its estimate as an *estimator*'.
Imai provides a concrete example which is close to our paper. 
He states that, 'in randomised controlled trials, the average outcome difference between the treatment and control groups represents the estimator for the average causal effect, which is our parameter' - in other words, the QOI (@ImaiKosuke2017Qss). ] in @ValentinoNicholasA2019EaCD article refer to the effects of the following treatments: 'varying the skill level (high and low), family status (single or married with children), national origin (Middle Eastern versus Latin American/Asian) and racial phenotype (Afrocentric, non-Afrocentric) of an individual immigrant [shown in the vignettes]'.

The authors estimate such causal effects in two main ways: one model including all countries in the sample and then 11 separate country-specific regressions ^[ Following the same presentation as the original authors of the paper, we abbreviate the 11 countries - Australia, Canada, Switzerland, Denmark, Spain, France, Japan, S.Korea, Norway, UK, USA - as following: AU, CA, CH, DK, ES, FR, JP, KR, NO, UK, US)].

The effect of the treatments is measured on a dependent variable - *support* - which is a composite indicator of three different questions: *admission*, *length of stay*, and *naturalisation*. The composite variable ranges from $0$ to $1$.

The coding and aggregation decisions made by @ValentinoNicholasA2019EaCD may be regarded as problematic by some.
Having coded *'can't say'* and *'undecided'* as 1 on a 0-2 scale (respectively, for the *admission* and *naturalisation* variables), and then aggregating them into a composite indicator, may skew the results in unpredictable directions. 
Perhaps it would have been better to check the results by having the same model run on these variables without these values (i.e. treat them as *don't know* and drop them, which would have turned all models into logistic regressions). 
Alternatively, if one consider these answers - again, the *'can't say'* and *'undecided'* - to be substantive answers (for perfectly justifiable reasons), it would have been better to treat these values as baselines within three multinomial logits (one for each variable), not assuming that they can be scaled in the middle as 1s. 
This is particularly concerning because then they are summed up and rescaled, thus treating them as ratio data/variables (which they are not).
Moreover, we have no clue as to the distribution of these values in the original variables - in other words, how pervasive the  *'can't say'* and *'undecided'* answers were. 
We asked the authors for the original variables and they denied access to the data^[Citing directly from their reply: 'They also are part of the country-specific files, which were gathered independently by the different researchers (using different grants). We don’t have permission to redistribute the individual surveys'.]. 

Figure 1 below offer a quick glimpse at the distribution of the dependent variable in @ValentinoNicholasA2019EaCD original article. 
There is ample variation in the data when we unpack support by treatment and control groups, besides the type of treatment and country.
For the skill-level treatment we see substantial differences between the medians of treated and control group for all states, whereas same does not apply to family-related treatments. 
At first sight, variance seems to be more country related rather than treatment-based. 
In other words, countries such as South Korea (KR) and Japan (JP) seem to have consistently low variance across treatments, whereas Denmark (DK) has relatively high variance disregarding the specific treatment being applied. 

```{r echo=FALSE, fig.height=4.95, fig.width=6}
dt_melt <- melt(dt, 
           measure.vars = c("treat_hstat", "treat_kids", "treat_drk", "treat_nat_me"),
           variable.name = "treatments", value.name = "value")

# this needs polishing, but key message is here
dt_melt %>% 
  dplyr::mutate(Treatment = factor(value, levels = c(0,1), labels = c("Control", "Treated")),
                treatments = factor(treatments, 
                                    levels = c("treat_hstat", "treat_kids", "treat_drk", "treat_nat_me"),
                                    labels = c("Job Status", "Family Status", "Complexion", "Middle Eastern"))) %>%
  ggplot() +
  geom_boxplot(aes(x = Treatment, 
                   y = support, 
                   fill = Treatment), size= .2,
               outlier.shape =NA) +
  facet_grid(countryt ~ treatments) +
  coord_flip() +
  labs(subtitle = str_wrap("Figure 1 : Distribution of outcome variable conditional upon treatment, by country", width = 75)) +
  theme(legend.position = "top",
        text = element_text(size = 10),
        axis.title.y = element_blank(),
        axis.text.x = element_text(angle = 90))
```

\section{Multilevel Model - Taking into account the nested structure of the data}
\label{sec:ex1-mlm}

A *mixed model* - as termed by the authors - accounts for multiple observations per subject. 
These multiple observations per subjects are likely to be correlated, as belonging to the same subject. 
This makes one of the assumptions of an OLS model - namely independent observations - likely to be violated (@AgrestiAlan2018Smft).

Indeed, the underlying idea of a multi-level model is that we take out of the error term another constant, which is connected to the clustering in the data (in the case at hand, repeated measurements of the same individual - so, it captures an additional source of variance at the individual level). 

The research design of the article foresees not only repeated observations for each interviewees, but also each interviewees nested within 11 countries. 
One of the authors' model specification envisages all countries pooled together (which the authors termed *all countries* in the graphs, and *all* in the regression tables). 
A possible explanation for this is that @ValentinoNicholasA2019EaCD wanted to understand if there were global trends that persisted regardless of country-specific features.

In this section, we couple the replication of the original model by @ValentinoNicholasA2019EaCD (termed *all* in the table below) with another model (labelled *MLM* for multilevel model, see @GelmanAndrew2006DAUR) which acknowledges the fact that we not only have repeated measurements for each subject, but also these subjects are nested within different countries (this is captured in the table below as the difference between the total number of *observations*, compared to the *group level* observations). 

The two models are almost identical in both coefficients and standard errors, with the exception of the intercept, which has a much larger standard error in the second specification.
In the *MLM* model, we have the same slopes for the two levels (individuals and country), as these are posited by the model to be fixed effects, but we see that - if needed - we are able to calculate the regression line at both levels.
What is perhaps more interesting is that we are able to look at the differences in the estimated fixed effects in each of the two models with predictors.
But we are now able to appreciate that what's included in the intercept of the nested model is actually considerable variation among states. 
Indeed, the intercept goes from a minimum of 0.44 in United Kingdom (UK), to a maximum of 0.71 in South Korea (KR). 
In short, we do not have to model a separate regression country for each country, and we can keep in a single model variation due to the respondents and variation due to country.

Besides changes in coefficients and standard errors, there are drops in both the Akaike information criterion (AIC) and Bayesian information criterion (BIC), a signal of the improvements in model fit of the second specification. 

The *residual standard deviation* and the *level 1 standard deviation* barely change between the two specifications.  
However, the different model specification enables us to capture the variance related to country level grouping, which is the the *Level 2 sd* in the table. 

```{r, echo=FALSE, message=FALSE, warning=FALSE, results='asis'}
# here we filter the data to treat_gender is 0, as the authors did, for consistency reasons
dt_gender0 <- dt %>% filter(treat_gender == 0) %>% as.data.frame()

# Varying-intercept model with an individual-level predictor
## Same model as Valentino's, but with REML = T
val_lmer <- lmer(support ~ Job_f + Children_f + ME_f + Complexion_f + candf + (1|uid),
                 data = dt_gender0, weights = wt_country, REML = T)

## Fully nested model: repeated obs nested within countries
nest_lmer <- lmer(support ~ Job_f + Children_f + ME_f + Complexion_f + candf + (1 | countryt / uid),
                  data = dt_gender0, weights = wt_country, REML = T)
# groups_level1 <- c(length(unique(val_lmer@flist[["uid"]])), length(unique(nest_lmer@flist[["uid:countryt"]])))   
# groups_level2 <- c(NA, length(unique(nest_lmer@flist[["countryt"]])))   

# regression table
stargazer(val_lmer, nest_lmer,
          add.lines = list(c("Residual sd", round(as.numeric(c(sigma.hat(val_lmer)$sigma$data, sigma.hat(nest_lmer)$sigma$data)), digits = 2)),
                           c("Level 1 sd", round(as.numeric(c(sigma.hat(val_lmer)$sigma$uid, sigma.hat(nest_lmer)$sigma$`uid:countryt`)), digits = 2)),
                           c("Level 2 sd", round(as.numeric(c(NA, sigma.hat(nest_lmer)$sigma$countryt)), digits = 2)),
                           c("Groups level 1", length(unique(val_lmer@flist[["uid"]])), length(unique(nest_lmer@flist[["uid:countryt"]]))),
                           c("Groups level 2", NA, length(unique(nest_lmer@flist[["countryt"]])))),
          column.labels = c("Valentino's all country", "MLM"), 
          model.numbers = F,
          model.names	= FALSE,
          font.size = "footnotesize", 
          column.sep.width = "1pt", 
          single.row = TRUE,
          no.space = TRUE,
          header=FALSE,
          intercept.bottom = FALSE, # intercepts at the top as in subsequent graph's table
          digits = 2, # limit coeff to 2 digits 
          table.placement = 'H',
          title = "Model comparison",
          covariate.labels = c("Intercept", "Job Status", "Family Status", "Middle Eastern",
                               "Complexion", "Vignette Position")) 
# modelsummary::modelsummary(list(val_lmer, nest_lmer))
```


```{r, eval=FALSE, include=FALSE}
# summary(val_lmer)
# summary(nest_lmer)

# extract sd of each level from each model
residual_sd <- round(as.numeric(c(sigma.hat(val_lmer)$sigma$data, sigma.hat(nest_lmer)$sigma$data)), digits = 4)
level1_sd <- round(as.numeric(c(sigma.hat(val_lmer)$sigma$uid, sigma.hat(nest_lmer)$sigma$`uid:countryt`)), digits = 4)
level2_sd <- round(as.numeric(c(NA, sigma.hat(nest_lmer)$sigma$countryt)), digits = 4)

# bind vectors in a single dataframe
models_sd <- rbind(residual_sd, level1_sd, level2_sd)
colnames(models_sd) <- c("Valentino's all", "MLM")
kable(models_sd)
```

A formal notation for both the intercept and error term may clarify where do the fixed and the random effects components of the model come from. 
Indeed, one can regard both the systematic components $\beta_{j}$ and the error term $\varepsilon_{ij}$ as being random variables assumed to have normal distribution. 
The intercept in the systematic component can be written as $\beta_{0j} = \gamma_{00} + \delta_{0j}$, wherein $\gamma_{00}$ is the mean intercept across all *j* clusters, and the random effect $\delta_{0j}$, which stands for the deviation of a given cluster *j* from the mean intercept across all *j* clusters $\gamma_{00}$ (@2018SMft). 
At an abstract level, they can be written as $\beta_{j} \sim \mathcal{N}(\gamma, \sigma_{\alpha}^2)$, and $\varepsilon_{ij} \sim \mathcal{N}(0, \sigma_y^2)$, respectively. 
This perspective enables us to understand that the original model by @ValentinoNicholasA2019EaCD had two main sources of stochastic variation in the dependent variable, one related to the *level 1 observations* (i.e., the vignettes), and another component stemming from the variation among the *level 2 grouping variables* (i.e., the respondents in the original paper), 
The *intraclass correlation (ICC)* ^['the fraction of total variation in the data that is accounted for by between-group variation. The intraclass correlation can also be thought of as the correlation among units within the same group. [...]. The greater the correlation among units within a group (that is, the bigger ICC is) the greater the impact on the standard error' @GelmanAndrew2006DAUR. ] captures the extent to which the variance can thus be apportioned between the various levels in a multilevel model. 


```{r, include=FALSE}
# ICC Valentino's model (with REML = T)
ICC_val_lmer <- as.numeric(sigma.hat(val_lmer)$sigma$uid / (sigma.hat(val_lmer)$sigma$uid + sigma.hat(val_lmer)$sigma$data))
ICC_val_lmer

# ICC Nested model
## Variation due to country variable
ICC_nest_lmer1 <- as.numeric(sigma.hat(nest_lmer)$sigma$countryt / (sigma.hat(nest_lmer)$sigma$`uid:countryt` + sigma.hat(nest_lmer)$sigma$countryt + sigma.hat(nest_lmer)$sigma$data))
ICC_nest_lmer1

## Variation due to respondents
ICC_nest_lmer2 <- as.numeric(sigma.hat(nest_lmer)$sigma$`uid:countryt` / (sigma.hat(nest_lmer)$sigma$`uid:countryt` + sigma.hat(nest_lmer)$sigma$countryt + sigma.hat(nest_lmer)$sigma$data))
ICC_nest_lmer2
```

Our results below indicate that while the authors' original model was able to account for about 0.68 of the variation in the data, our *MLM* model is able to apportion variation partly to the respondents (`r ICC_nest_lmer2`) and partly to countries (`r ICC_nest_lmer1`).

## Prediction from simulation

If we want to accurately portray both the stochastic and systematic uncertainty from our *MLM* model, we can simulate each of these two components.
One of the choices when simulating from a multilevel model is the selection of the very level we are predicting.
@GelmanAndrew2006DAUR offer a way to think about this choice in terms of '*Prediction for a new observation in an existing group*' or '*Prediction for a new observation in a new group*'.
Such choice entails simulating from distributions with different means and standard deviations. 
For instance, in the case of the former - i.e. 'Prediction for a new observation in an existing group' - @GelmanAndrew2006DAUR use the residual standard deviation, whereas in the case of the latter - namely, 'prediction for a new observation in a new group' - they use standard deviation of the level 1 grouping variable. 
In the case of our *MLM* model, we have a third standard deviation, which is related to the level 2 grouping variables, namely states. 
We focus on this source of variation as the added value of this model compared to the original by @ValentinoNicholasA2019EaCD is that we are able to include in a single model both variation within and between countries. 
To obtain confidence intervals from the simulations-based predicted probabilities, we set the covariates at median level, and pay attention to the difference between being shown a low-skilled versus high-skilled immigrants in the vignettes.
This is similar to what @ValentinoNicholasA2019EaCD do in their paper, where they plot the difference in the average treatment effect for their skill-related treatment.

```{r, echo=FALSE, fig.height=3, fig.width=6, message=FALSE, warning=FALSE}
library(GLMMadaptive) # this is necessary to get var covar matrix from lmer object
library(mvtnorm) # multivariate normal distribution
set.seed(1111)

# number of simulations
n_sims <- 100000

# extract matrix of X
X <- model.matrix(nest_lmer)
median_covariates1 <- apply(X, 2, median)
median_covariates0 <- median_covariates1
median_covariates0["Job_fHigh status"] <- 0 # changing skill covariate to low 

# pull out all the sigmas from the nested model - we're going to need them when simulating the stochastic component
sigma_a2_hat <- as.numeric(sigma.hat(nest_lmer)$sigma$countryt)
sigma_a1_hat <- as.numeric(sigma.hat(nest_lmer)$sigma[["uid:countryt"]])
sigma_y_hat <- as.numeric(sigma.hat(nest_lmer)$sigma$data)

# pull out the coefficients for both levels - fixed effects are going to be the same, only thing changing is the intercept
coef_hat_l2 <- as.matrix(coef(nest_lmer)$countryt)
# dim(coef_hat_l2)
coef_hat_l1 <- as.matrix(coef(nest_lmer)[["uid:countryt"]])
# dim(coef_hat_l1) 

# We need to convert the output of the vcov function into a matrix, as the default with lmer cannot then be used in the rmvnorm
sigma <- as.matrix(vcov(nest_lmer))

# initialise the matrix
sim_beta_country <- NULL

# loop over each country, get multivariate distribution to simulate uncertainty in he systematic component 
for (i in 1:nrow(coef_hat_l2)) {
  # always the same set of coeff - not sure whether we should change them and use fixef(nest_lmer), for instance - I opted to include coef_hat_l2 because this is by country - it would be preferable to have Gary's advice on this
  sim_b <- rmvnorm(n_sims, mean = coef_hat_l2[i,], sigma = sigma)  
  sim_beta_country <- rbind(sim_beta_country, sim_b)
}
# dim(sim_beta_country) # check dimensions of matrix

# is there variation in the means of intercept? 
## rough way to understand if we're getting this right: expectation is that the means of intercept should vary across states
df <- as.data.frame(sim_beta_country)
colnames(df) <- colnames(coef_hat_l2)
df$country <- rep(rownames(coef_hat_l2), each = n_sims)

# now simulate stochastic component
sim_mu0 <- sim_beta_country %*% as.matrix(median_covariates0) # high skill = 0
sim_mu1 <- sim_beta_country %*% as.matrix(median_covariates1) # high skill = 1
# dim(sim_mu0)

# simulate to capture uncertainty in the stochastic component - but with what sigma?
## here we pick the country level sd
sims0_Yhat <- rnorm(n_sims * dim(coef_hat_l2)[1], # number of simulation times country
                    mean = sim_mu0, sd = sigma_a2_hat)
sims1_Yhat <- rnorm(n_sims * dim(coef_hat_l2)[1], # number of simulation times country
                    mean = sim_mu1, sd = sigma_a2_hat)

# append the country identifier & merge into single df
df0_Yhat <- data.frame(country = rep(rownames(coef_hat_l2), each = n_sims),
                       sims = "sims0_Yhat",
                       y_hat = sims0_Yhat)
df1_Yhat <- data.frame(country = rep(rownames(coef_hat_l2), each = n_sims),   
                       sims = "sims1_Yhat",
                       y_hat = sims1_Yhat)
df_Yhat <- bind_rows(df0_Yhat, df1_Yhat)

# visualise
df_Yhat_plot <- df_Yhat %>%
  dplyr::group_by(country, sims) %>%
  dplyr::summarise(ci.025 = quantile(y_hat, c(0.025)),
                   ci.5 = quantile(y_hat, c(0.5)),
                   ci.975 = quantile(y_hat, c(0.975)))
# tibble to sort countries
t <- df_Yhat_plot %>%
  dplyr::filter(sims == "sims1_Yhat") %>%
  dplyr::arrange(desc(ci.5))
cntr_sort <- t$country  # vector of sorted countries to be fed to ggplot

# viz
df_Yhat_plot %>%
  mutate(countryt = factor(country, levels = cntr_sort),
         sims = factor(sims, 
                       levels = c("sims1_Yhat", "sims0_Yhat"),
                       labels = c("High", "Low"))) %>%
  ggplot() +
  geom_pointrange(aes(x = sims, y = ci.5, 
                      ymin = ci.025, ymax = ci.975)) +
  facet_wrap(~  countryt, nrow = 1) +
  labs(x = "Skill treatment", y = str_wrap("95% Confidence interval from simulations", width = 30),
       subtitle = str_wrap("Figure 2: Average treatment effect of job status by country, simulating from multilevel model", width = 80)) + 
  theme(axis.text.x = element_text(angle = 90, vjust = .5))
```

As a consequence of the simulations, we notice that - within each country - the confidence intervals for treatment and control tend to overlap, casting some doubts on the actual differences between the treatment effects. 
Further, we also see that - within each treatment - confidence intervals also overlap (even in the case of the two cases at the opposite side of the spectrum, i.e. South Korea (KR) and United Kingdom (UK)), again raising questions on the difference between countries. 
Overall, we also see that the confidence interval are much larger compared to those in @ValentinoNicholasA2019EaCD as a consequence of the propagation of uncertainty in both the systematic and stochastic components. 

Importantly, the reader should bear in mind that these results are conditional upon keeping the fixed slopes (something that could be changed in a different specification) and simulating from normal distribution with standard deviation equal to the standard deviation of our level 2 grouping variable. 
Future research may wish to assess whether changing these model specification and assumptions lead to different results. 


\section{Modeling the Spread in Immigration Support}
\label{sec:ex2-var}

The second way we add to existing literature is by modelling the standard deviation of immigrant support as a measure of "spread in attitudes towards immigrants" between the surveyed individuals within each country.
Our model provides us with two interesting insights - When looking at heterogeneity in support for immigrants, the features of the immigrant matters a lot more than the features of the respondents. 
Likewise, the spreads vary a lot more between different countries for certain kinds of immigrants. 

## Motivation and Model description

The multi-level structure of our dataset motivates us to use a multiple equations approach where we model the responses to the two vignettes separately which helps us to account for the correlation between the two responses from each individual in the survey. 
Each pair of responses is modeled as a draw from a bivariate normal distribution. 
We then reparametrize not only the mean, but also the standard deviations of the distribution as a function of our treatment covariates. 
The parameters to both the mean and standard deviation equations are then jointly estimated using Maximum Likelihood.

This is an alternative approach to the Panel Linear model estimated using generalized least squares (GLS) in the original paper. 
We make the additional assumption of normality to enable us to run simulations for the purpose of calculating interesting quantities of interest in the subsequent sections.
We work with the following model:\begin{center}
$\text{Support for Immigrant in first vignette by respondent i = }Y_{1i}$\
$\text{Support for Immigrant in second vignette by respondent i = }Y_{2i}$\
$\mathbf{Y_i} \sim \mathcal{N}(\mathbf{\mu_i}, \mathbf{\Sigma_i})$\
$\mathbf{\mu_i} = \begin{bmatrix}
\mu_{1i} \\
\mu_{2i}
\end{bmatrix}$\end{center}
Where,\
$\mu_{1i} = \beta_0 + \beta_1JobStatus_{1i} + \beta_2FamilyStatus_{1i} + \beta_3MiddleEastern_{1i} + \beta_4Complexion_{1i} + \beta_5VignettePosition_{1i}$\
$\mu_{2i} = \beta_0 + \beta_1JobStatus_{2i} + \beta_2FamilyStatus_{2i} + \beta_3MiddleEastern_{2i} + \beta_4Complexion_{2i} + \beta_5VignettePosition_{2i}$\
and,\begin{center}
$\mathbf{\Sigma_i} = \begin{bmatrix}
\sigma_{1i}^2 & \sigma_{12i} \\
\sigma_{12i} & \sigma_{2i}^2
\end{bmatrix}$\end{center}

We reparametrize the standard deviations,\
$\sigma_{1i} = \alpha_0 + \alpha_1JobStatus_{1i} + \alpha_2FamilyStatus_{1i} + \alpha_3MiddleEastern_{1i} + \alpha_4Complexion_{1i}$\
$\sigma_{2i} = \alpha_0 + \alpha_1JobStatus_{2i} + \alpha_2FamilyStatus_{2i} + \alpha_3MiddleEastern_{2i} + \alpha_4Complexion_{2i}$\
$\text{Within-respondent Correlation} = \rho_i = \frac{\sigma_{12i}}{\sigma_{1i}\sigma_{2i}}$

Our model successfully replicates the estimates from the authors' original model and produces the correct standard errors thereby reassuring us that our model is not mis-specified. 
Our approach also provides a flexible, software package-free framework to model various features of the distribution as functions of a wide variety of variables that might be available to the researcher. 
This might be very useful for testing certain hypotheses and calculating certain quantities of interest.

In our paper, we focus on two questions. 
Firstly, do these immigrant features (our treatments) impact the "spread" in attitudes towards immigrants amongst the individuals in a country? 
This spread in attitudes, measured as the standard deviation in immigrant support, is a key indicator of dispersion of opinions on certain kinds of immigrants in a population where a small "spread" might indicate a sort of consensus amongst the people about the desirability of the immigrant. 
Secondly, we shift focus to the question of how this spread as defined above differs between the different countries in our sample.    

## Results

Our model shows that the Job Status of the immigrant has a negative effect on the spread of support in five out of our ten country sample. 
Having a Middle Eastern background has a positive effect on spread in the US. 
That said, Figure 3 clearly illustrates that even though these effects are statistically significant, they are not very substantial. 
When compared to the effects of these treatments on the mean support, the effects of those same treatments on the spread are quite small.

The large intercept in the standard deviation (SD) equation implies that the treatments do not have much explanatory power for not only the spread, but also the mean support for the immigrants. 
It tells us that a large part of the variation in responses of different individuals is absorbed in the intercept, which remains unexplained by our treatment variables. 
It is clear that more data and more variables are needed to make more substantive arguments and inform better policy.

Our approach, which we believe extends beyond just this paper in its usefulness, can be a very useful way to evaluate proposed policy changes. 
A well-informed policy decision requires that attention is paid not only to how immigration attitudes will change, but also how the overall distribution of attitudes in a population changes. 

```{r, echo=FALSE,warning=FALSE,results='asis',message=FALSE}
load('../../data/interim/variance_data/restabs_var.rda')

country_list <- c("AU" ,"US", "CA", "UK", "NO", "CH", "KR", "JP", "FR", "DK")

dep.var <- "ImmigrantSupport"
regressors <- c('Constant','a','b','c','d')
vec.coeffs <- resdf1$AU
vec.se <- resdfse1$AU
vec.coeffs2 <- resdf1$US
vec.se2 <- resdfse1$US
vec.coeffs3 <- resdf1$CA
vec.se3 <- resdfse1$CA
vec.coeffs4 <- resdf1$UK
vec.se4 <- resdfse1$UK
vec.coeffs5 <- resdf1$NO
vec.se5 <- resdfse1$NO
vec.coeffs6 <- resdf1$CH
vec.se6 <- resdfse1$CH
vec.coeffs7 <- resdf1$KR
vec.se7 <- resdfse1$KR
vec.coeffs8 <- resdf1$JP
vec.se8 <- resdfse1$JP
vec.coeffs9 <- resdf1$FR
vec.se9 <- resdfse1$FR
vec.coeffs10 <- resdf1$DK
vec.se10 <- resdfse1$DK

d <- as.data.frame(matrix(rnorm(10 * 7), nc = 7))
names(d) <- c(dep.var, regressors)
f <- as.formula(paste(dep.var, "~ 0 +", paste(regressors, collapse = "+")))
p <- lm(f, d)

stargazer(p,p,p,p,p, type = "latex", 
          coef = list(vec.coeffs,
                      vec.coeffs2,
                      vec.coeffs3,
                      vec.coeffs4,
                      vec.coeffs5),
          se = list(vec.se,
                    vec.se2,
                    vec.se3,
                    vec.se4,
                    vec.se5),
          t = list(vec.coeffs/vec.se,
                   vec.coeffs2/vec.se2,
                   vec.coeffs3/vec.se3,
                   vec.coeffs4/vec.se4,
                   vec.coeffs5/vec.se5),
          column.labels = country_list[1:5],
          covariate.labels = c('Intercept','Job Status','Family Status','Middle Eastern',
                               'Complexion'),
          dep.var.labels = 'SD of Support for Immigrant',
          title = 'Model Results : Standard Deviation Equation',
          header = FALSE,
          omit.stat = "all",
          table.placement = 'H',
          no.space = TRUE,
          model.numbers = F,
          model.names	= FALSE,
          font.size = "footnotesize", 
          column.sep.width = "1pt", 
          single.row = TRUE,
          intercept.bottom = FALSE,
          omit.table.layout = "sn") # intercepts at the top as in subsequent graph's table

stargazer(p,p,p,p,p, type = "latex", 
          coef = list(vec.coeffs6,
                      vec.coeffs7,
                      vec.coeffs8,
                      vec.coeffs9,
                      vec.coeffs10),
          se = list(vec.se6,
                    vec.se7,
                    vec.se8,
                    vec.se9,
                    vec.se10),
          t = list(vec.coeffs6/vec.se6,
                   vec.coeffs7/vec.se7,
                   vec.coeffs8/vec.se8,
                   vec.coeffs9/vec.se9,
                   vec.coeffs10/vec.se10),
          column.labels = country_list[6:10],
          covariate.labels = c('Intercept','Job Status','Family Status','Middle Eastern',
                               'Complexion'),
          dep.var.labels = NULL,
          header = FALSE,
          omit.stat = "all",
          table.placement = 'H',
          title = 'Model Results : Standard Deviation Equation (Continued)',
          no.space = TRUE,
          model.numbers = F,
          model.names	= FALSE,
          font.size = "footnotesize", 
          column.sep.width = "1pt", 
          single.row = TRUE,
          intercept.bottom = FALSE)
```

```{r, echo=FALSE,fig.align='center',fig.height=8.9,fig.width=6,warning=FALSE,results='asis',message=FALSE}
# Loading and arranging the plots in a single page.
load(file='../../data/interim/variance_data/plotlist22_auth_comb.rda')
grid.arrange(plotlist2[[1]]+ylab('')+ theme(legend.position = "none"),
             plotlist2[[2]]+ylab('')+xlab('')+ theme(axis.text.y=element_blank()),
             plotlist2[[3]]+ylab('')+theme(legend.position = "none"),
             plotlist2[[4]]+ylab('')+xlab('')+ theme(axis.text.y=element_blank()),
             plotlist2[[5]]+ylab('')+ theme(legend.position = "none"),
             plotlist2[[6]]+ylab('')+xlab('')+ theme(axis.text.y=element_blank()),
             plotlist2[[7]]+ylab('')+ theme(legend.position = "none"),
             plotlist2[[8]]+ylab('')+xlab('')+ theme(axis.text.y=element_blank()),
             plotlist2[[9]]+ theme(legend.position = "none"),
             plotlist2[[10]]+xlab('')+ theme(axis.text.y=element_blank()),
             ncol=2,top = textGrob('Figure 3 : Parameter Estimates (Maximum Likelihood)',
                                   gp=gpar(fontsize=10)),
             bottom = textGrob(
               "The horizontal bars represent 95% confidence intervals",
               gp = gpar(fontface = 3, fontsize = 9),
               hjust = 1,
               x = 1))
```

## Differences in the spread of Immigrant Support across countries.

Now we look at how this spread in attitudes differs across countries. 
We simulate the between-respondent spread in support for immigrants of a specific kind (For example, High Job Status and No Family) for each country. 
We plot our results in Figure 4. 
Only the United States (US), United Kingdom (UK), France (FR), Japan (JP) and South Korea (KR) are retained to avoid overcrowding in the figure. 
For each plot, Middle Eastern and Complexion are set to $0$, which means the immigrant is not middle eastern and their complexion has not been darkened by the researchers. 
The figure plots the distribution of the results from $100,000$ simulations, which helps us visualize not only the point estimates of between-respondent standard deviation for each country, but also the estimation uncertainty of those estimates.

The figure shows an interesting pattern between the Western countries (US, UK and France) and the Eastern ones (Japan and South Korea). 
For a low job status immigrant with or without family, there is not a large difference between the countries. 
However, for immigrants with a high job status, the difference between Eastern and Western nations increases. 
A large part is due to the spread in South Korea becoming much smaller. 
For high job status immigrants with families, the spread is very similar amongst the western nations and is larger than than the spread in Japan and South Korea.

Even though our model helps us simulate how the spreads vary amongst countries, a more careful study into the reasons for such patterns might be warranted.

```{r, echo=FALSE,fig.align='center',warning=FALSE,fig.height=8.9}
load(file='../../data/interim/variance_data/plots_dense.rda')
grid.arrange(plot1,plot2,plot3,plot4,ncol=1,
             top = textGrob('Figure 4 : Between-respondent Spread in Support by Country and Immigrant type',
                                   gp=gpar(fontsize=10)))
```



# Conclusions

The purpose of this paper is to expand upon the current trend in research on immigration attitudes by switching from a focus on difference in means and measures of central tendencies, to an analysis of the variance in attitudes between and within countries.
In the context of division and fragmentation connected with opinions on immigration in many countries, it is of substantive interest to understand the degree to which such opinions actually vary.

The literature on immigration attitudes is expected to grow in the future. 
Our paper illustrates how future analyses in this domain can benefit by looking at measures of spread such as the standard deviation in addition to typical measures of central tendencies. 
By leveraging the nested structure of most datasets in this domain, simulating first differences for treatment variables, explicitly modeling the spread as a function of explanatory variables, and using simulations to calculate quantities of interest, a researcher can obtain interesting insights that might otherwise have been unexplored. 

After accounting for the nested structure and simulating first differences, we did not retrieve significant effects of the skill-based treatment within or between countries for each treatment group. 
We model the spread in immigration support between respondents and find that the skill-based treatment has a negligible negative effect on spread in many countries.
This implies that features of an immigrant do not play a large role in building consensus or division within a population regarding immigration attitudes. 
The countries in our sample have different amounts of spread among respondents. 
We used simulations to look at how the features of an immigrant affect this difference between the countries.
Our main recommendations concern both methods and substance. 
In terms of methods, we adopt a simulation based approach to estimate quantities of interest also in the case of survey experiments with complex modelling structure.
Even if the simulations are subject to assumptions and subjective decisions in terms of quantities of interest, they have the advantage of propagating uncertainties during estimation.
We encourage researchers to match our analysis and estimation of spread in public opinion in other policy domains.
On that comparative basis, we would be better positioned to understand whether our results are peculiar to attitudes towards immigration, or if they are in common with other domains.

